Extended Translation Models in Phrase-based Decoding
نویسندگان
چکیده
We propose a novel extended translation model (ETM) to counteract some problems in phrase-based translation: The lack of translation context when using singleword phrases and uncaptured dependencies beyond phrase boundaries. The ETM operates on word-level and augments the IBM models by an additional bilingual word pair and a reordering operation. Its implementation in a phrase-based decoder introduces translation and reordering dependencies for single-word phrases and dependencies across phrase boundaries. More, the model incorporates an explicit treatment of multiple and empty alignments. Its integration outperforms competitive systems that include lexical and phrase translation models as well as hierarchical reordering models on 4 language pairs significantly by +0.7% BLEU on average. Although simpler and using fewer dependencies, the ETM proves to be on par with 7-gram operation sequence models (Durrani et al., 2013b).
منابع مشابه
NiuTrans: An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation
We present a new open source toolkit for phrase-based and syntax-based machine translation. The toolkit supports several state-of-the-art models developed in statistical machine translation, including the phrase-based model, the hierachical phrase-based model, and various syntaxbased models. The key innovation provided by the toolkit is that the decoder can work with various grammars and offers...
متن کاملEfficient Incremental Decoding for Tree-to-String Translation
Syntax-based translation models should in principle be efficient with polynomially-sized search space, but in practice they are often embarassingly slow, partly due to the cost of language model integration. In this paper we borrow from phrase-based decoding the idea to generate a translation incrementally left-to-right, and show that for tree-to-string models, with a clever encoding of derivat...
متن کاملInvestigations on Phrase-based Decoding with Recurrent Neural Network Language and Translation Models
This work explores the application of recurrent neural network (RNN) language and translation models during phrasebased decoding. Due to their use of unbounded context, the decoder integration of RNNs is more challenging compared to the integration of feedforward neural models. In this paper, we apply approximations and use caching to enable RNN decoder integration, while requiring reasonable m...
متن کاملIntegrating Translation Memory into Phrase-Based Machine Translation during Decoding
Since statistical machine translation (SMT) and translation memory (TM) complement each other in matched and unmatched regions, integrated models are proposed in this paper to incorporate TM information into phrase-based SMT. Unlike previous multi-stage pipeline approaches, which directly merge TM result into the final output, the proposed models refer to the corresponding TM information associ...
متن کاملEnriching Phrase-Based Statistical Machine Translation with POS Information
This work presents an extension to phrasebased statistical machine translation models which incorporates linguistic knowledge, namely part-of-speech information. Scores are added to the standard phrase table which represent how the phrases correspond to their translations on the partof-speech level. We suggest two different kinds of scores. They are learned from a POS-tagged version of the para...
متن کامل